{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "批量归一化的提出正是为了应对深度模型训练的挑战。在模型训练时，批量归一化利用小批量上的均值和标准差，不断调整神经网络中间输出，从而使整个神经网络在各层的中间输出的数值更稳定。**批量归一化和下一节将要介绍的残差网络为训练和设计深度模型提供了两类重要思路**。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch\n",
    "import sys\n",
    "sys.path.append(\"..\") "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "tensor([[2.6667]])\n",
      "tensor([1.0000, 2.6667, 3.0000, 4.0000])\n"
     ]
    }
   ],
   "source": [
    "X = torch.randn(2,3,32,32)\n",
    "mean = X.mean(dim=0, keepdim=True).mean(dim=2, keepdim=True).mean(dim=3, keepdim=True)\n",
    "\n",
    "X =torch.tensor([[1,3,3,4], [1,2,3,4.0], [1,3,3,4]])\n",
    "print(X.mean(dim = 0, keepdim = True).mean(dim=1, keepdim=True))\n",
    "print(X.mean(dim = 0))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Pytorch batchNorm"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import torch.nn as nn\n",
    "import dl_utils\n",
    "net = nn.Sequential(\n",
    "            nn.Conv2d(1, 6, 5), # in_channels, out_channels, kernel_size\n",
    "            nn.BatchNorm2d(6),\n",
    "            nn.Sigmoid(),\n",
    "            nn.MaxPool2d(2, 2), # kernel_size, stride\n",
    "            nn.Conv2d(6, 16, 5),\n",
    "            nn.BatchNorm2d(16),\n",
    "            nn.Sigmoid(),\n",
    "            nn.MaxPool2d(2, 2),\n",
    "            dl_utils.FlattenLayer(),\n",
    "            nn.Linear(16*4*4, 120),\n",
    "            nn.BatchNorm1d(120),\n",
    "            nn.Sigmoid(),\n",
    "            nn.Linear(120, 84),\n",
    "            nn.BatchNorm1d(84),\n",
    "            nn.Sigmoid(),\n",
    "            nn.Linear(84, 10)\n",
    "        )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "training on  cuda\n",
      "epoch 1, loss 0.2503, train acc 0.909, test acc 0.838, time 9.7 sec\n",
      "epoch 2, loss 0.1216, train acc 0.912, test acc 0.821, time 8.9 sec\n",
      "epoch 3, loss 0.0789, train acc 0.914, test acc 0.704, time 9.1 sec\n",
      "epoch 4, loss 0.0577, train acc 0.917, test acc 0.842, time 9.0 sec\n",
      "epoch 5, loss 0.0451, train acc 0.919, test acc 0.686, time 8.7 sec\n"
     ]
    }
   ],
   "source": [
    "batch_size = 256\n",
    "train_iter, test_iter = dl_utils.load_data_fashion_mnist(batch_size=batch_size)\n",
    "device = 'cuda'\n",
    "\n",
    "torch.backends.cudnn.benchmark = True\n",
    "torch.backends.cudnn.deterministic = False\n",
    "torch.backends.cudnn.enabled = True\n",
    "\n",
    "lr, num_epochs = 0.001, 5\n",
    "optimizer = torch.optim.Adam(net.parameters(), lr=lr)\n",
    "dl_utils.train_ch5(net, train_iter, test_iter, batch_size, optimizer, device, num_epochs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "training on  cuda\n",
    "epoch 1, loss 0.2957, train acc 0.897, test acc 0.835, time 9.1 sec\n",
    "epoch 2, loss 0.1415, train acc 0.900, test acc 0.802, time 9.1 sec\n",
    "epoch 3, loss 0.0904, train acc 0.904, test acc 0.857, time 9.1 sec\n",
    "epoch 4, loss 0.0660, train acc 0.906, test acc 0.801, time 9.2 sec\n",
    "epoch 5, loss 0.0514, train acc 0.908, test acc 0.805, time 9.4 sec"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 小结\n",
    "\n",
    "- 在模型训练时，批量归一化利用小批量上的均值和标准差，不断调整神经网络的中间输出，从而使整个神经网络在各层的中间输出的数值更稳定。\n",
    "- 对全连接层和卷积层做批量归一化的方法稍有不同。\n",
    "- 批量归一化层和丢弃层一样，在训练模式和预测模式的计算结果是不一样的。\n",
    "- PyTorch提供了BatchNorm类方便使用。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
